\documentclass[11pt]{article} % For LaTeX2e
\usepackage{rldmsubmit,palatino}
\usepackage{graphicx}
\usepackage{hyperref}

\title{A More Robust Way of Teaching Reinforcement Learning and Decision Making}

\author{
Miguel Morales\thanks{http://www.mimoralea.com} \\
Department of Computer Science \\
Georgia Institute of Technology \\
Atlanta, GA 30332 \\
\texttt{mimoralea@gatech.edu} \\
}

\newcommand{\fix}{\marginpar{FIX}}
\newcommand{\new}{\marginpar{NEW}}

\begin{document}

\maketitle

\begin{abstract}
  I propose a new way of teaching reinforcement learning and decision making
  that is designed to improve on traditional academic teaching. I use
  a three-step approach to deliver a complete learning experience in a
  way that engages students and allows them to grasp the concepts regardless
  of their skill level. I present a specific way of teaching the content, a
  new and fully configured coding platform, a set of hands-on exercises, and
  a group of recommended next steps for deeper learning.
\end{abstract}

\keywords{teaching tutorials jupyter intuition hands-on}
\repository{https://www.github.com/mimoralea/applied-reinforcement-learning}
\spresentation{https://youtu.be/ltjS5ktziLQ}
\lpresentation{https://youtu.be/1WjNj_JmFaE}

\acknowledgements{
  I am thankful to my mentor, Kenneth Brooks, for providing assistance when
  navigating the field of Educational Technology, and for giving direct,
  concise and clear feedback on how to make this project better. Thank you
  to all my peers who also provided sincere feedback throughout the semester.
  I hope to see you all enjoying our OMSCS course in Reinforcement Learning
  and Decision Making, but not before going through these lessons. It will
  be a rewarding experience. Pun intended.
}

\startmain % to start the main 1-4 pages of the submission.

\section{Introduction}

Reinforcement Learning and Decision Making is a complex subject. It is the
focus of research in a variety of fields, including artificial intelligence,
psychology, machine learning, operations research, control theory, animal
and human neuroscience, economics, and ethology, so the vast amount of
available information can become counterproductive if not handled properly.
Beginners often find themselves lost while trying to identify the key
concepts that are truly vital for understanding. Additionally, reinforcement
learning and decision making, being a relatively new field, is often taught by
world-class researchers who unintentionally omit explanations of core
concepts that might seem too basic \cite{gapranda}, yet remain
fundamental. This creates a knowledge gap that, if left unfilled, causes
trouble when learning the more advanced topics.

These points present some of the challenges of sparking an interest and keeping
students engaged throughout their entire learning experience. If the content is not
delivered correctly, students can quickly feel confused, lost and disengaged, and
when that happens learning stops.

\section{Sparking Curiosity}

Fortunately, since reinforcement learning and decision making is studied
by fields like animal and human neuroscience, ethology, and psychology \cite{suttons98},
the concepts can often be taught in a direct way, using ordinary examples
that connect on an intuitive level. Recent studies in neuroscience have
shown that emotions and cognition are interrelated \cite{intuition}. By keeping
the readings approachable, I allow students to connect to the narratives at different
levels. The notion of learning by interacting with an environment should be easy
for all of us to understand, as this is one of the ways we learn. Reinforcement
learning in artificial intelligence has several similarities with human learning.

I leverage this fact and combine it with the strategies described below to
keep readers engaged in the material.

\subsection{Using Simple And Direct Language}

One important component is the use of simple and direct
language throughout the documents. This keeps the reader engaged regardless
of their level of reinforcement learning knowledge.

I carefully select words and examples that bring the concepts to a
common-sense understanding so that all students can follow the initial
readings.

\subsection{Keeping A Single Narrative}

Additionally, and this was perhaps the most difficult part, I keep a single
narrative throughout the sequence of concepts being presented. The intention
here is to allow students to keep reading and to use the understanding they
accumulate in previous lessons to understand the subsequent ones. Similar to what
the direct instruction paradigm \cite{directinstruction} encourages, one of the
most important pieces of work in this project is the structure and sequence
in which the concepts are presented.

The more traditional approach is to select concepts from the entire body of
reinforcement learning and decision making and use different lessons to present
different material. The problem with this approach is that it does
not help the student grasp the complete picture or the connections between topics.
Presenting concepts in a logical sequence, despite being complex to define
initially, not only feels more natural for beginners, but also helps
them stay engaged in the material while they continue learning.

\subsection{Showing Concepts And Their Complement}

Finally, in order to spark and maintain the students' curiosity, I show the
full spectrum of a single concept. Even if it means only defining the opposite
side, I still make an effort to mention it and briefly explain it. Things in
life often have a complementary side, and showing the two together brings out
the qualities of each. For example, explaining deterministic actions is interesting
all by itself, but students gain a much better understanding when they are explained
alongside stochastic actions. This approach is also known as Compare and
Contrast, and the literature suggests that teaching comparative thinking
strengthens student learning \cite{compare}.
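
As a minimal sketch of this contrast, in notation that is assumed here rather
than taken from the lessons, let $s$ be the current state, $a$ the chosen
action, and $s'$ the next state. With deterministic actions the outcome is
fixed by a transition function $f$, whereas with stochastic actions it is
drawn from a distribution $P$:
\[
  s' = f(s, a)
  \qquad \mbox{versus} \qquad
  s' \sim P(\cdot \mid s, a), \quad \sum_{s'} P(s' \mid s, a) = 1 .
\]
The deterministic case can be read as the special case in which $P$ places all
of its probability on the single state $f(s, a)$, which is one reason the two
ideas are easier to understand side by side.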

I paid close attention to showing concepts and their complements in every
lesson. The expectation is that this helps students get a better
sense of the full range of possibilities for any given point. Keeping concepts
in this format keeps students engaged as the concepts get progressively more
complex.

\section{Removing Friction}

Once the students' curiosity has been sparked and intuition is engaged, a
convenient way to interact with the concepts should be presented. The
friction of getting hands-on experience is one of the most difficult
barriers for beginners to break, but once it is overcome, the student can
understand the concepts much better.

I worked on three points to remove the friction
beginners face when first getting into reinforcement learning.

\subsection{Setting Up A Convenient Environment}

One of the most remarkable accomplishments of this project is the creation
of a fully configured reinforcement learning platform for using OpenAI
Gym \cite{openaigym} environments in Jupyter Notebooks inside Docker
containers.

Technicalities aside, a ready-to-go environment gets students up and running
by copying and pasting the provided commands: the first run takes about
20 minutes, and every subsequent run takes less than 1 minute. This allows
the student to spend only a minimal amount of time battling with packages and
configuration scripts, which add no knowledge of reinforcement learning, and to
concentrate all their effort on the concepts that truly matter.

\subsection{Providing Boilerplate Code}

Moreover, I supplement the notebooks with abundant boilerplate code. Examples of
the code provided to the students include graphs and visualization functions,
which very likely aid the learning process \cite{visualization}, and
binaries that make web requests in the background to show videos of carefully
selected agent episodes.

This allows the students to interact only with the bits of code that are directly
related to reinforcement learning, and to safely ignore the rest.

\subsection{Asking For Minimal Effort}

Then, I ask students to put in just enough effort to get them engaged.
The hands-on interaction with the notebooks is designed for beginners getting
started with reinforcement learning. These students may not have seen
reinforcement learning, or even machine learning, code in action before. Therefore,
in addition to all of the boilerplate material already mentioned, I also
provide the most common algorithms in each of the notebooks, and
only ask the students to complete small sections that make the core
algorithms work more effectively.

The idea is that once they have had contact with reinforcement learning code, they
will have more confidence when interacting with more advanced problems and
projects during the OMSCS course.

\section{Showing Options}

Lastly, connecting to intuition and getting hands-on experience will be
futile unless the students develop a new interest in exploring the field by
themselves. This is the most important aspect of the project, because I believe
education is about motivation. The role of an instructor is merely to spark
students' curiosity and help them find the path to their own realization.

Therefore, at this point I hope to have awakened the students' interest in
exploring this marvelous field. Now, showing the path for further learning is
a final and very important step.

\subsection{Assigning Relevant Readings}

To help the students better navigate the field of reinforcement learning,
I provide a ``Further Reading'' section in every lesson, and a
single final section of ``Recommended Books'' at the end of the project. The fact
that I teach the concepts in direct and simple language is by no means
an indication that academic material can be skipped. Rather, the way I
present the material should be seen as a \emph{primer}, helping the
concepts presented later come together more naturally and be absorbed more quickly.

\subsection{Watching Academic Lectures}

Next, I would hope students go on to watch academic lectures.
Students need to hear world-class experts in the field of reinforcement learning
teach concepts that they are thoroughly familiar with and have been studying and
working with. For this reason, I added a ``Recommended Courses''
section for students to continue the search and learning on their own.

\subsection{Completing Homework and Projects}

Finally, I would hope that many of the students using these materials are
the same students who are either planning to enroll in the OMSCS course or have just
enrolled. The OMSCS course, after a brief explanation of core concepts,
covers very advanced concepts at a very rapid pace. In addition, there are specially
designed homework and project assignments so that the students get a solid
grasp of reinforcement learning.

Completing the coursework would certainly put the students in the driver's seat,
making them owners of their destiny and letting them pick wisely which reinforcement
learning area to explore next.

\section{Future Work}

No work is perfect and neither is this one. However, for the
{\raise.17ex\hbox{$\scriptstyle\sim$}}2 months of effort
put into it, I think the progress that has been made is incredible. I started
with an aggressive proposal and delivered on most of it. I kept progress steady, but
flexible enough to adapt along the way, while still completing core components.
The lessons, the container, the notebooks, the assigned readings, and the recommended
courses all provide a solid foundation for a deep understanding of
reinforcement learning and decision making.

It is this foundation that can now make further progress easier to achieve. After
opening this work to the community during the summer semester, I hope to receive
help and feedback to make this project even better going forward.

\subsection{Additional Notebooks}

An important component for future work is the addition of notebooks. I had the capability
to complete seven notebooks, but while trying to rush in some final work, I noticed
that the quality of the later notebooks was seriously degrading the quality of the
project. Instead of pushing on to additional notebooks, I opted to improve the quality
of the earlier notebooks and leave the newer notebooks out of this release.

This creates an opportunity to re-add the notebooks that were removed and
improve them considerably. Adding entirely new notebooks would also be of
great benefit.

\subsection{Effectiveness Evaluation}

A more difficult component of future work would be to find a way to measure the effectiveness
of this material. Ideally, an Educational Technology student could take on the task
of researching whether the strategy presented here actually improves student
performance. It would be interesting to gather and study this kind of feedback.

\subsection{Request For Feedback}

Finally, one of the next steps I will take is to release this project
in different places. First, I will release it to previous students on the Slack
channel of the Georgia Tech Study Group organization. These folks are now veterans
of our course and would be a great source of feedback. Second, I will release it to
the OpenAI community through their discussion forums in an attempt to get a very
diverse group to review it and provide feedback. The expectation is that this feedback
will be followed up with actual changes in the form of GitHub pull requests. This, and only
this, would make this the project I initially envisioned.

\section{Conclusion}

In this paper, I proposed a more robust way of teaching reinforcement learning
and decision making. I presented a series of lessons taught in a very specific
format, I delivered a fully configured coding environment for the development
of reinforcement learning agents and algorithms, I provided boilerplate
code and a series of notebooks to assist with hands-on experimentation, and I
supplemented this with more academic readings and lectures.

I sincerely hope this project will be useful to lots of people interested in
learning the ins and outs of reinforcement learning and decision making. And,
in fact, the project recently helped an OMSCS Reinforcement Learning and
Decision Making student find his way around the complex topic of function
approximation in reinforcement learning. The potential, however, is bigger, and
the path to improvement obvious in some cases. My desire is to see this
work continue to grow into a more mature and effective way of teaching this
amazing field.

\medskip
 
\begin{thebibliography}{9}

\bibitem{gapranda}
  Julie E. Ferguson.
  \textit{Bridging the Gap Between Research and Practice.}
  KM4Dev, Volume 1(3), 46--54.
 
\bibitem{suttons98}
  Richard Sutton, Andrew Barto.
  \textit{Reinforcement Learning: An Introduction.}
  MIT Press, 1998.
 
\bibitem{intuition}
  Mary Helen Immordino-Yang, Matthias Faeth.
  \textit{Building Smart Students: A Neuroscience Perspective on the Role
    of Emotion and Skilled Intuition in Learning.}
  Bloomington, 2010.
 
\bibitem{directinstruction}
  James F. Baumann.
  \textit{The effectiveness of a direct instruction paradigm for teaching main idea comprehension.}
  Reading Research Quarterly (1984): 93--115.
 
\bibitem{compare}
  Harvey F. Silver.
  \textit{Compare and Contrast.}
  Strategic Teacher PLC Guides, 2010.
 
\bibitem{visualization}
  Thomas L. Naps et al.
  \textit{Exploring the role of visualization and engagement in computer science education.}
  ACM SIGCSE Bulletin, Vol. 35, No. 2. ACM, 2002.
 
\bibitem{openaigym}
  Greg Brockman, Vicki Cheung, Ludwig Pettersson et al.
  \textit{OpenAI Gym.}
  arXiv:1606.01540, 2016.
 
\end{thebibliography}

\end{document}
